with oligonucleotide probes.



Determination of the formula of genomic DNA, i.e. genome sequencing, by hybridization with oligonucleotide probes (YU Patent Application 570/87) envisages the use of 100000 oligonucleotide probes and the same number of hybridizations with 6000000 addressed sample-clones on filters in order to determine the contents of oligonucleotide sequences in each clone. The process presents improvements in the preparation of samples for hybridization and improvements which enable one to follow gene expression by determining partial or complete fragment sequences of genomic DNA, mRNA or cDNA. By binding fragments of genomic DNA to discrete particles (DP) of a microscopic size which are recognizable in the step of reading the experimental image, the necessity for addressed samples on filters is dispensed with; this drastically reduces the automatic/robotic component of the process and allows miniaturization of the entire method from the level of an industrial installation to the level of a laboratory instrument. Processes for binding DNA fragments to DPs recognizable in common reactions allow elimination of cloning, i.e. DNA amplification in host cells, and, in the process of library forming, of the need for formation of 6000000 addressed samples in any one of the phases of a process for sequencing by hybridization.
a) Field of the Invention

The present invention belongs to the field of molecular biology.



b) Technical Problem 

The genome size varies from 4 x 10^6 nucleotides in bacteria (E. coli) to 3 x 10^9 nucleotides in mammals, including man. The determination [...] is a technological challenge for the science of the end of the 20th century. It is believed that the availability of [...] sequence in genome would cause a [...] rise in medicine, biotechnology and fundamental biology itself, and would beneficially influence all fields of these sciences. In contrast to [...] method, which is not effective enough for accomplishing this task, we claimed a method of sequencing by hybridization (Yugoslavian Patent Application 570/87 and Amendment No. 4521 [...]). However, this method requires industrial [...] investments. The present solution relates to technological improvements of the basic principle claimed which permit miniaturization, decrease of the investment costs and wider application of this method and of all other applications of oligonucleotide hybridization for determination of genomic sequences.



c) State of the Art 

The knowledge about parts or entire genomes on the level of primary structure, as well as the possibility of following the inheritance using this information, are being increasingly recognized as conditions for a more efficient and faster study of [...] processes. A part of experimental investigations will be replaced by computer research on the sequences obtained. It might turn out that some biological phenomena (evolutionary processes) will be accessible for study only through the analysis of genome sequences.

Two areas are recognizable in which there is an increase of organized effort to find methodological solutions which would allow the determination of primary genetical information. One is concerned with the detection of a large number of mapped polymorphic sites in genomic DNA of an individual, family or population. In fact this project means the determination of a defined part of the genomic sequence which represents a specific genome, for which we propose the name GENOGRAM. The other project deals with the determination of the entire sequence of human and other genomes. In the extreme, it might mean the determination of the sequences of genomes of most species of interest and in a sufficient number of individuals in each species. The first project has two basic approaches: the detection of polymorphic sequence by the specificity of RE [...] or by the specificity which hybridization with ONP has. The latter approach has advantages [...] number of polymorphic sites, since it does not require the determination of DNA fragment length and is easily carried out on an amplified target [...] or amplified ligation-hybridization reaction. In order to obtain individual genetic maps with a [...] cM resolution it is necessary to follow 5,000 to 10,000 [...].

[...] is envisioned as a multiphase process: mapping with a final goal of determining [...] sequence. [...] also makes use of sequence recognition by RE and measurement of DNA fragment lengths, or of the determination of contents of [...] ONP, and sequencing itself, according to [...], makes use of measurement of [...] indirectly [...] determines sequence. Two further approaches for determining experimentally the order of nucleotides are being considered as well. One is [...] and is based on the sequential removal of [...] reading of the sequence by means of a specific detection microscope. Finally, a theory of an approach has been developed in which the sequence is not arrived at directly by the experimental determination of the order of nucleotides, but the contents of ONS are found instead and then these data are transformed into sequence information by computation work (SBH). Presently, the only realistic [...] by ONP of the same kind used in the above [...].

[...] and in all methods (except eventually in microscopy), one has to operate with a huge number of samples which, depending on the [...] and number of genomes [...] to be processed, amounts to between [...] and [...]. Since each sample is subjected to one or more identical






EP 0 392 546 A2 



or similar reactions, there is a problem of the ways and speed of performing the great number of repetitive operations, as well as a problem of the ways and speed of gathering experimental information and storing it in computer memory. Obviously the solution for the first problem is a robotized process, and the most efficient way of gathering data is image analysis of the experimental image. The speed of image analysis already amounts to one million to 10 million pixels per second, and this allows differentiating of a 10 times smaller number of point-shaped objects.

Here we define the "Informational Approach" for the study of genome primary structure, analyze its characteristics and informational-technical requirements, and give conceptual solutions for its major technological components which can substitute a miniaturized process for a massive robotized process using addressed samples. Basically, the idea is to use the mixture of samples in a form of discrete recognizable particles.

THEORETICAL ANALYSIS 



Informational Approach and its Characteristics



A common characteristic of three methods which, for achieving different goals, make use of "mismatch free" hybridization by ONP is the experimental determination of the contents of a single, some or almost all ONS in specific fragments of genomic DNA. These are the method of detection of polymorphic sites, the method of forming an organized genome library (link-up) and the SBH method. For the same targets there are methods which are, in an informational sense, based on the experimental determination of the position, i.e. the sequence of individual nucleotides or definite ONP. These methods are RFLP analysis, restriction mapping and forming of organized libraries on the basis of restriction pattern, and sequencing by means of acrylamide gel. We define methods which make use of determination of the contents of ONS as the "Informational Approach". In this approach the same principle is used for both the determination of a total sequence and of selected parts of the genome (GENOGRAM) and so, in general, this approach can be defined as the "Informational approach for a study and following of primary structure of genome". It has several essential characteristics.

The most essential feature of this approach is the possibility of acquiring experimental data (contents of ONS) in unpositioned samples. Since the sequence of elements is not the object of the measurements, nor is the physical distance of elements, there is no requirement for ordered spatial disposition of samples, nor should they have a defined starting position, etc.

The second characteristic is that contents of ONS can in principle be determined in a sample which in the final experimental treatment occupies a micro volume or an area of optimal size. Since no physical separation is requested, such transport effects as diffusion, etc. are avoided. These effects usually impose the necessity for macro volume.

The third characteristic (which is a consequence of the preceding ones) is a considerable density of data bits per unit of experimental volume or area. It can be defined as the possibility of achieving a high degree of parallel acquisition of data.

The fourth positive characteristic in this approach is the fact that a data fraction, such as sample position, order of elements in the sequence, starting and ending DNA fragments or order of operations, does not have to be determined experimentally, but instead it can be replaced with informational computer processing of data which can be gained experimentally in a simpler and faster way.
The fifth feature is the increase in the number of experimental data bits (size of the matrix) at the expense of reducing the amount of work necessary for the acquisition of a more sophisticated and smaller matrix. This means that the burden of the process is placed on the data reading step or, in other words, on the input of more data bits in computer memory.

These characteristics allow the formulation of a concept of a miniaturized, fast and frugal process for generation of necessary experimental data.
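The idea that only the contents of ONS are measured, while the order is recovered by computation, can be illustrated with a small sketch. It is not the algorithm of the cited SBH application; the function names `spectrum` and `reconstruct` are hypothetical, and the sketch assumes an error-free spectrum, a known starting oligonucleotide and k of at least 2.

```python
def spectrum(seq, k):
    """Return the set of all length-k subsequences (the 'ONS content') of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers, start, k):
    """Extend from `start` while exactly one unused k-mer overlaps by k-1 bases.

    Returns the assembled string and the k-mers left unused. Extension stops
    at the first ambiguity, so repeats yield a subfragment rather than the
    whole sequence.
    """
    remaining = set(kmers) - {start}
    seq = start
    while True:
        suffix = seq[-(k - 1):]
        candidates = [m for m in remaining if m.startswith(suffix)]
        if len(candidates) != 1:
            break
        seq += candidates[0][-1]
        remaining.remove(candidates[0])
    return seq, remaining


# The order of the 3-mers is never measured; it is recovered from overlaps.
kmers = spectrum("ATGCGTCA", 3)
assembled, unused = reconstruct(kmers, "ATG", 3)
print(assembled, unused)
```

The input to `reconstruct` is an unordered set, which mirrors the point that neither sample position nor element order has to be determined experimentally.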

Scientific and practical needs for determination of GENOGRAM and entire genomic sequences have the same informational-technological requirements. We shall define a data BIT as a discrete, experimentally gained, meaningful datum. Thus one data bit is the fact that the millionth nucleotide in the 5th human chromosome is A, or that fragment X of human genomic DNA contains ONS Y. A minimal number of data bits for determination of the entire sequence of one mammalian genome is 3 billion, i.e. it is equal to the number of bp. Since the ultimate goal is to have all data bits stored in computer memory, for determination of a process speed it is important to know the part (%) of the necessary number of data bits which can be experimentally collected and stored in memory per unit of time.

One can assume that in future the time for determining the sequence of a complex genome preferably cannot be longer than a year. If a second is taken as a unit of time (one year has 3.1536 x 10^7 seconds), then we may consider it acceptable to acquire all data bits in 10^6 to 10^7 seconds. If one chooses 10^6 seconds (12 days without [...]) as a very favourable time span, then [...] every second. It should be taken into consideration that the [...] days defined in the previous analysis [...] 5 to 10 times [...], since a considerable part of this [...] is not used for effecting sequencing operations but is used instead for sample preparation.

How many data bits are necessary for the [...] of either GENOGRAM or a sequence [...] by [...] ONS? For sequence determination [...] separate genome fragments (samples) [...] about [...] ONP are needed and these figures [...] data bits. For GENOGRAM at least [...] sites or fragments are necessary. It could be argued that 10^[...] [...] investigated because this figure [...] an approximate number of genes. In that case GENOGRAM would need a comparatively [...] great number of fragments in [...] most of ONS would have to be determined in such an extensive GENOGRAM. In [...] necessary ONS reaches the order of several tens of thousands. In practice, it is more [...] to have ONP with a length of 7 or 8 bases and thus with 16,000 or 65,000 ONP, to possess a [...] detection of the [...] sequence. It is considered that the most efficient way is not to determine specific pairs of probes and targets but instead to try to get more information by a simple combination of each probe with each target. Based on the analysis of this sort, it can be [...] GENOGRAM about 10^[...] data bits are necessary. We can consider it reasonable to collect in 10^6 seconds (12 days) information for [...] GENOGRAMS. Roughly, it is the analysis of [...] patients per day or testing of a sample of [...] individuals of a certain population within a period of two months. With such an approach as regards the requirements of molecular genetics or [...], we reach again the same number of [...] of work necessary for sequencing
genome. Since we established the fact that a millionth part of all data bits has to enter memory in one second, we arrive at a parameter of crucial importance for this approach: how to produce and store a million data bits per second.
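The required data rate can be checked with simple arithmetic. The sketch below is only an illustration: it assumes that one data bit corresponds to one sample-probe combination and takes the figures of 6000000 clones and 100000 probes from the abstract; the function name `required_rate` is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 3.1536 x 10^7, as quoted in the text


def required_rate(samples, probes, seconds):
    """Return (total data bits, data bits per second) for a full probe-by-sample matrix."""
    total_bits = samples * probes
    return total_bits, total_bits / seconds


# Assumed inputs: 6,000,000 clones x 100,000 probes collected in 10^6 seconds.
total, rate = required_rate(6_000_000, 100_000, 10**6)
print(SECONDS_PER_YEAR)   # 31536000
print(total, rate)        # 600000000000 600000.0
```

Under these assumptions the rate is about 6 x 10^5 data bits per second, which is of the same order as the one million per second named in the text.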

These huge technological requirements for building up a knowledge of GENOME structure can be met by developing a system for quick gaining of the necessary data bits. In the [...] one such system, which makes use of [...] characteristics of the Informational approach, is described.



Description of the Solution

Phases of informational approach

One can define four phases [...] part of this approach, when the content of ONS is checked by ONP hybridization [...]. [...] information related to the contents of ONS in specific [...] genomic DNA [...] processing takes place in which the entire sequence is generated or parts thereof. These phases are:

1. [...]

2. Probe preparation (synthesis, ONP bank, probe labeling)

3. Hybridization

4. Reading and storing data bits.
Three goals are defined which should be achieved [...] in each phase [...] whole: samples and probes should be to the [...] possible extent recognizable by position (coordinates, addresses) and to the maximum [...]; there should be the least [...]; [...] efficient hybridization reaction [...] "reading" one million ON[...] per second.

Samples as a system of discrete and recognizable particles

The samples can be considered in two phases. [...] in ordered dot-blots. One can pose the question [...] use of addressed samples be avoided. This is especially important in the case when the samples [...] be kept for [...] storage; or [...] can easily be [...] (i.e. to hybridize all ONP) on a [...] and many such hybridized spots in separate hybridization areas. One would like to recognize such hybridization spots, even if they are not separated [...] amplified genomic fragments in tubes with [...] hybridization dots with [...]. [...] methods and techniques for chemical synthesis of DNA, the answer is the use of [...] as the substitute for [...] tubes to keep the liquid samples apart. A drop of water containing the necessary number of copies of the DNA fragment given is replaced by the solid particle carrying the required number of copies of the same DNA fragment attached to its [...]. [...] particles can be looked upon as small beads of definite size and shape, similar to the ones already in use in different applications. In order to simplify the description, the use of discrete particles in the hybridization reaction will be presented first, starting from physically separated and amplified samples obtained in a preparatory phase.

So there is a library of genomic clones placed on microtiter plates. One adds discrete particles to each well and a binding reaction takes place, resulting in DNA being attached to them. Thus each liquid sample is divided into a certain number of DPs. Aliquots of DPs from each well are mixed together and spread in a monolayer of the required density. This is followed by the necessary number of [...]. In this way one can [...] areas (HA) similar to filters in the dot blot procedure. Every DP represents one dot, and the solid support a certain area of the filter. One can imagine a simple case in which each HA contains enough randomly displayed DPs that each clone from the library is represented at least once. The other HAs are replicas, but DPs are present in different places. Every HA can in principle be hybridized and reused as classic filters. The main problem is to recognize DPs with the same clone in different HAs. The latter is probably very difficult to achieve with the 10^[...] ONP required for SBH. Also, the possibility of performing many hybridization reactions in parallel is lost if only one HA is used.
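How many randomly displayed DPs are "enough" for each clone to be represented at least once can be estimated with the classical occupancy formula. The sketch below assumes that every DP carries one clone drawn uniformly at random from the library, which is a simplification; the function names are hypothetical.

```python
import math


def missing_fraction(n_clones, n_particles):
    """Probability that a given clone is absent from an HA holding n_particles random DPs."""
    return (1 - 1 / n_clones) ** n_particles


def particles_for_full_coverage(n_clones, expected_missing=1.0):
    """DPs needed so that the expected number of absent clones is about expected_missing.

    Uses the approximation n_clones * exp(-m / n_clones) = expected_missing.
    """
    return math.ceil(n_clones * math.log(n_clones / expected_missing))


# With 10-fold redundancy only about exp(-10) of the clones are expected to be absent.
print(missing_fraction(1000, 10_000))
print(particles_for_full_coverage(1000))
```

The logarithmic factor is the reason a library has to be over-sampled several-fold before every clone appears on at least one particle.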

We see three principal ways of recognizing DPs which can be combined.

1) Labeling with physical attributes of DP like size, shape and color which can be differentiated in a phase of reading, for instance, during image analysis.

2) Labeling with different combinations of ONs which can be recognized as such by hybridization with appropriate ONPs. Thus, out of 20 different ONs, 3 x 10^4 different combinations can be formed with 10 ONs each. By attaching each combination of ONs to DPs, 3 x 10^4 differently labeled DPs are obtained. They can be recognized by detection of the combination given in hybridization with 20 ONPs which are complementary to the ONs [...].

3) Labeling (better: recognition) by the use of a certain fraction of ONPs on all HAs. This principle is the one used in the link-up method. The requirement is that HAs can stand testing with a great number of ONPs in such a way that one part can be used for recognition. Lehrach found that 100 ONPs having the same density as 8- or 9-mer probes for SBH are suitable for recognizing overlapped cosmids in an entire genomic library. In [...] recognition of identical clones, not the overlapping ones, is required in a mixture of a defined number of different clones. If the mixture is simpler, all clones need not be mixed at once, but separate mixes with a smaller number of clones can be prepared and used to obtain subdivided HAs, thus reducing the number of probes required.
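The number of distinct labels available under principle 2 is a binomial coefficient, and it can be computed exactly. The sketch below is an illustration only; `label_combinations` is a hypothetical helper, and the exact values it prints need not coincide with the rounded figures quoted in the text, which may assume constraints not stated here.

```python
from math import comb


def label_combinations(n_oligos, per_particle):
    """Number of distinct subsets of per_particle ONs chosen from n_oligos ONs."""
    return comb(n_oligos, per_particle)


# 10 ONs chosen out of 20, the example used for principle 2.
print(label_combinations(20, 10))   # 184756

# 25 oligo targets chosen out of 50, the case mentioned later for complex OHAs.
print(label_combinations(50, 25))
```

Because the count grows so quickly with the number of ONs, a few tens of recognition probes are enough to address a very large number of particles.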

With the combination of all three principles and the use of subdivided HAs one can recognize the 10^[...] samples required by GENOGRAM and [...]. [...] it is obvious that the use of HA [...] parts and 100 "labels" per each principle allows the discrimination of this number of separate samples. In this combination scheme, one has to prepare 10000 samples with differently labeled DPs. [...] maximal use of the third [...] the need to prepare differently labeled DPs [...] increases the number of required hybridizations.



Oligonucleotide probes bound to DP : Inverse OP- 
SBH 

Two ways of sequence determination by the [...] of clones. ONPs are bound to a support in the form of dots. A similar system for following [...] is being developed by Southern (private communication). The problem of these methods is that for ONPs of the suggested informational length of 8 bases in two cases, and 4 bases in the third one, either only a very specific information of a complex sample (8-mer [...]) can be determined, or the sample used for hybridization must be a very short DNA, shorter than 70 bp in the one case and shorter than 200 bp in the other case. These approaches are impractical for sequence determination of complex genomes because more than 10 million hybridizations are needed. On the other hand, a separate synthesis of longer probes and their deposition onto [...] place in HA with [...] is probably limited to 8-10 mer lengths. However, by making use of recognizable DPs to which oligo[...] are bound, it is possible to overcome these problems, and such an inverse procedure (ISBH) has the potential allowing the application to genome sequencing. The [...] carrying specific [...] targets for recognition according to principle 2, [...] specific functional oligonucleotides [...] selected as in the case of generating clones by a selection (see part Direct preparation of the fragments of genomic DNA on DP). By a combined synthesis (see part on Oligo[...]) ONPs of a particular length can be synthesized in a comparatively small number of [...] in such a manner that each ONP is on a DP with a specific known combination of oligo targets. DPs prepared in this way are used for forming a monolayer HA (OHA, oligo-hybridization area) as would be the case if DNA fragments of a certain clone were linked to them. In order that most of the ONPs in an OHA are represented with at least one DP, a certain amount of redundancy is necessary (ca. 10 times). By the hybridization of OHAs with the ONPs which are complementary to the oligo targets from which combinations for marking DPs are [...] (for the most complex OHAs less than 50 probes and hybridizations are necessary, because 50 oligo targets are sufficient for preparing more than 10^14 combinations of 25 targets in a small number of reactions), the exact position of each ONP in each OHA is established. OHAs prepared in this way, with information on the position of each ONP, are a "product" which can be used for very fast and simple sequence determination of very long DNA fragments. A maximum length (DNA complexity) (L) depends on the length of ONPs (N) which have been synthesized on DP in a given HA. In order to obtain on average the sequence of the fragment given in a form of 10 subfragments (SF in the nomenclature of SBH), L is not to be considerably greater than 4^N/100. For cosmid sequencing 11-mers are necessary, for YAC insert sequencing 13-mers. In the former case 4 million differently marked DPs are needed, in the latter one 65 million.
This is within the number of clones necessary for sequencing mammalian genomes by a direct SBH with 8-mer probes. If YAC clones were used in ISBH, then for a mammalian genome about 1-5 thousand groups with 10 differently marked clones in each and the same number of hybridizations would be needed, while in the usual SBH 10000 hybridizations with groups consisting of 10 differently marked 8-mers. The number of syntheses and operations is in this case the same for both methods.
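The relation between probe length N, the number of distinct probes and the longest fragment that can be sequenced can be tabulated directly. The sketch below assumes the rule that L should not greatly exceed 4^N/100, a reading supported by the cosmid and YAC figures quoted in the text; the function names are hypothetical.

```python
def probe_count(n):
    """Number of distinct oligonucleotides of length n over the alphabet A, C, G, T."""
    return 4 ** n


def max_fragment_length(n, ratio=100):
    """Largest fragment length L (in bp) keeping L at or below 4^n / ratio."""
    return probe_count(n) // ratio


for n in (11, 13):
    print(n, probe_count(n), max_fragment_length(n))
# 11 4194304 41943      (about 4 million probes, cosmid-sized inserts)
# 13 67108864 671088    (about 65 million probes, YAC-sized inserts)
```

The two printed rows reproduce the order of magnitude of the "4 million" and "65 million" differently marked DPs named for cosmid and YAC sequencing.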

A process for sequence determination using OHA would comprise several simple steps:

1) random fragmentation of a sufficient mass of a given fragment of genomic DNA to lengths slightly bigger than the ONPs linked to DP;

2) marking of the fragments generated; one possibility is the use of terminal transferase and one fluorescently marked nucleotide triphosphate;

3) discriminative hybridization of the marked genome fragments and OHA;

4) "image analysis" of the microscope OHA image.

In these four steps the sequence of ONPs in the given long DNA fragment would have been determined. By the computer processing of data according to the algorithm and programmes for generating SF, either a continuous sequence of the fragment given or the sequence in a [...] limited number of SFs would have been regenerated. SBH can be successfully applied for determination of one very important part of genome information, and these are the sites with polymorphic sequence. Everything that one needs is a sufficient number of functional oligonucleotides which in most cases have only one target with complementary sequence in the given sample. For a mammalian genome, the most suitable for this purpose are 17-mers. On the average, each tenth 17-mer should have a complementary sequence in [...] genome. With an OHA containing 10^8 17-mers (less than 1/100 of all 17-mers) about 10^7 17-mers would be detected as positive. Since with 17-mers 17 bp can be read, an OHA of this sort would allow "reading" of at least 10^8 bp. Since it is believed that in each group of 1000 bp there exists one polymorphic bp, such an OHA would allow following about 10^5 polymorphic sites. By analysing individuals in several generations from several families a very dense genetic map (0.1 cM) could be determined which would be useful, in a much simpler way than RFLP markers, for following in a great number of individuals for various investigations.
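The chain of estimates for 17-mers can be followed step by step. The sketch below is a plausibility check rather than part of the method; the chip size of 10^8 probes is an assumption chosen to satisfy the stated condition "less than 1/100 of all 17-mers", and the variable names are hypothetical.

```python
N = 17
all_17mers = 4 ** N                  # 17179869184 possible 17-mers

chip_probes = 10 ** 8                # assumed OHA size, below 1/100 of all 17-mers
assert chip_probes < all_17mers / 100

positive = chip_probes // 10         # "each tenth 17-mer" finds a complementary target
bases_read = positive * N            # each positive probe reads 17 bp
polymorphic_sites = bases_read // 1000   # one polymorphic bp per 1000 bp

print(all_17mers, positive, bases_read, polymorphic_sites)
# 17179869184 10000000 170000000 170000
```

Under these assumptions the chip reads somewhat more than 10^8 bp and covers on the order of 10^5 polymorphic sites.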
ISBH has several significant characteristics:

1) With the possibility of a great number of rehybridizations, OHA acquires the properties of a measuring instrument or, stated in "informational" jargon, of a CHIP (alternatively: sequencing card), which permits minimal sample processing.

2) A possibility for preparation of OHA of different complexity for sequencing fragments of different lengths. One can imagine an OHA with 200000 9-mers for sequencing 1-2 kb fragments, an OHA with 4 million 11-mers for sequencing cosmid inserts of 50 kb, an OHA with 65 million 13-mers for sequencing YAC inserts and, what is certainly most attractive, an OHA with from 1 billion 15-mers to 1000 billion 20-mers for sequencing complete chromosomes, or genomes, or the entire mRNA (cDNA) of a specific tissue in only one hybridization reaction. It should be mentioned that no additional difficulty is imposed by samples consisting of several shorter fragments (mRNA of a certain tissue). However, in the case of total mRNA (cDNA) of the specific tissue, a problem can arise from the different quantity of each mRNA. One possible solution for this problem is to use a sufficient mass of the sample (PCR application) in order to bring the least represented mRNAs [...] exists which has nothing to be linked to.

3) There are no requirements for 10-100 different markers, which are almost unavoidable in the usual SBH in order to decrease the number of separate hybridizations. If they do exist, different markers can be sequenced by a simultaneous hybridization with the sequencing card.

4) A possibility for highly specific labeling (incorporation of 100 marked nucleotides by means of terminal transferase), by means of which both the requirement for a number of ONP molecules per DP and the mass of DNA fragments being sequenced are decreased. For 10^5 ONP molecules per one DP, in the case of 15-mers, for 100 OHAs with a redundancy of 10 times, it is necessary to perform 3000 syntheses, each one on the present usual scale for the synthesis of an oligonucleotide in an amount of 1 mg. If 1000 molecules of ONP per DP are sufficient, then with 400 syntheses on a 10 mg scale 100 OHAs with all 20-mers can be prepared.

5) A possibility for achieving great accuracy in hybridization. In order to avoid forming a great number of SFs, it is necessary to have such a ratio between L and N that on the average only each tenth ONP possesses a complementary sequence in the given fragment of genomic DNA. On the other hand, that means that chances for a larger number of sites with one non-paired nucleotide are small, which represents the most difficult case for discrimination. When L/4^N = 1/1000, oligonucleotide probes approach the discrimination possessed by unique genome probes.

The main uncertainty of ISBH is hybridization with a very complex probe, especially in the case of using ONPs longer than 13 bases and genome fragments larger than a million bp. The basic problem is simultaneous hybridization with ONPs having two extreme GC contents. Some solutions of this problem have already been given, for instance, washing in tetramethylammonium chloride. Another problem is of a technical nature and has already been mentioned. It is the combination of oligonucleotide synthesis and linking of oligonucleotides already synthesized to the same DPs. Since these two reactions do not necessarily take place at the same time, solution of this difficulty does not represent a huge, non-solvable practical problem. On the other hand, highly homologous and simultaneously highly repetitive sequences represent significant obstacles for this approach. In direct SBH with clones this problem has been solved by using libraries with clones of different size. Because of these sequences (LINE, SINE) a much larger number of subfragments (SF) will be formed in comparison with the case of a genome with random sequence. The solution of this problem is using as big a 4^N/L ratio as possible and/or using the existing and new information from clone systems and other methods for comparing generated SFs.



Direct preparation of the fragments of genomic DNA on DP

Depending on the fact whether the detection of ONS by hybridization can be done on one or a large number of molecules of the fragment given, and on the mode of the fragment amplification, one can define three possible ways of marking samples as direct mixes of DPs. Such an approach eliminates the need for preparing and maintaining macro-separate and addressed samples.

1) Detection on an individual molecule by using DPs labeled according to principles 1 and 2 in combination with recognition according to principle 3. The labeled DPs serve for discrimination of the parts of genome such as chromosomes, YACs, etc., or individuals in parallel preparation of a number of GENOGRAMS. The fragments of a defined part of genomic DNA will be attached as unit molecules to specifically marked DPs in separate reactions. DPs are mixed afterwards and used to form one HA. DPs carrying the same, or more often maximally overlapped, fragments will be recognized using principle 3 from the previous section. In contrast to using cloned fragments, the fragments are here rarely identical, so that groups of densely overlapped fragments are recognized. In this situation the complete contents are obtained only for the part of the sequence shared by the group of fragments given. To use random groups of fragments obtained by ligation (ordering library in SBH) one needs to PCR or clone them without separation of clones. The "separation" of necessary fragments required for GENOGRAM from the rest of genomic DNA is best accomplished by a PCR reaction.

2) Amplifying by PCR. PCR can be used for preparation of a genomic library of fragments with a continual length determined by the success of amplification. The length of 5 kb has been demonstrated. The procedure would require ligation of single [...] to ends of the genomic fragment mixture, the dilution of ligation products to single molecules per volume, and then their use in separate PCR reactions, for example, in microtiter wells. In this way clones of starting fragments could be obtained in vitro.
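Dilution "to single molecules per volume" is usually described with Poisson statistics: at a low mean number of molecules per compartment, most compartments are empty and nearly all occupied ones hold exactly one molecule. The sketch below assumes independent, random partitioning; the function name `occupancy` is hypothetical.

```python
import math


def occupancy(mean_per_volume):
    """Poisson probabilities of 0, exactly 1, and more than 1 molecule per compartment."""
    p0 = math.exp(-mean_per_volume)
    p1 = mean_per_volume * p0
    return p0, p1, 1.0 - p0 - p1


# At a mean of 0.1 molecules per well or droplet:
p_empty, p_single, p_multiple = occupancy(0.1)
print(round(p_empty, 4), round(p_single, 4), round(p_multiple, 4))
# 0.9048 0.0905 0.0047
```

The trade-off is that a lower mean reduces mixed compartments but increases the share of empty ones, so more wells or droplets are needed per clone obtained.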

It is possible to see the implementation of PCR without the separation of individual fragments in addressed liquid samples. The requirement is that micro droplets of an amplification mixture, each containing either a single fragment or none, are [...] appropriate membranes [...] together with DP conglomerate. DP con[...] and should be [...] DP [...] conglomerate, providing the way to [...] which are required for [...]. Microsphere formation [...] for formation of [...] droplets [...] than a highly robotized process with high [...]. Every microsphere [...] amplification reaction similar to [...]. [...] amplification, the reaction [...] amplified fragments to spheres is performed [...] reagent for which the membrane is permeable [...] used. The disruption of membranes and centrifu[...] results in a mix of DPs [...] fragment is represented in a sufficient number of copies on an adequate number of DPs.

3) The separation of groups of densely overlapping fragments on DPs capable of selecting, instead of amplification of a single fragment. One can imagine separation by the selection [...] hybridization. One should have 10^[...] [...] each carrying a specific ON. ONs [...] the lengths which ensure their [...] genomic sequence. The number of DPs will be explained later. Random fragments (longer than finally required) obtained from a large mass of genomic DNA are subjected to [...] or 3' exonucleases. These fragments are subsequently randomly cut and selected. After selective [...] covalent linking by ligation is performed. In this way each DP will have bound to itself those ONs which are internally displaced for the length of the single-stranded end containing the ON given. The recognition of DPs with the "same" fragments can be done by labeling DPs by any one or a [...] of principles according to 1 and/or 2, and/or by using recognition without labeling DPs according to principle 3. This selective procedure is even more applicable to GENOGRAM, where [...] samples per individual genome is 100-1000 times smaller. DPs would carry ONs of selected sequence complementary to the sequence of fragments that ought to be examined.

The procedure 1. is the most simple one in a
technological sense, but the detection of hybridization
on a single molecule is a difficult, still unresolved
problem. The other two procedures presume
many technically untested operations. On the other
hand, several different, theoretically possible solutions
allow the conclusion that preparation of defined
fragments of genomic DNA, as the separate samples,
can be achieved in a DP mix.



ON bank 



Procuration of ONs (ONPs) in a mixture 



The syntheses of large number of sep**eONs 
,„ b a awsiderat* task* standard'^ 

are used. However, the synthesis of ONs can be 
a«)reciabty speeded up by using r^binatxxi pno- 
£^.£0** enures a more 
c^aper syntheses ol smaller quantrt.es of indhndual 
f5 ONs. An even higher degree of «*»^*™~" 
be achieved by synthesis of a sufficiently
large number of different ONs having multiple
applications and which can be used by different
laboratories. This principle has been applied in
the synthesis of linkers, adapters and primers. In
this way an ON bank would be obtained (an initiative
by Crkvenjakov, Drmanac, Beattie). One can ask
the question which bank wouk, be H" ™? ^' 
The answer Ses in the recognition of ON char- 
» acteristics that are the most suitable for maxx 

of sequence by hybridization, change* ««*WB 
DNA^equence and synthesis ol DNA fragment 
(amplified fragments, subclones, clones surtabtelor 
x SBhTc*. be performed even with very short 
ONPs, 7 or even 6 nucleotides in length. ONPs
about 20 bases long are suitable for hybridization
to total genomic DNA. Primers for site-specific
mutagenesis and PCR are usually 15-20-mers.
35 fTven^-rners are adive primers in PCfl. The f*°f** 
dure for DNA synthesis based on sequential reining 
of short blocks is being developed. We consider
the bank containing all possible 3-mers to 8-mers
very useful for the following reasons: (i) mentioned
areas of application, (ii) technologically acceptable
number of samples for the bank to contain, (iii) the
possibility of generating longer ONs from shorter
ones (ligation, the use of dideoxynucleotides and
terminal transferase). Their total number is about
90,000. According to Beattie's calculation a bank of
8-mers (65,536 ONs) could be synthesized in less
than 6 months with a total investment of 3 million
dollars. The cost ol materials and » 
bank having a stock of 1-2 mg of each ON is 2
million dollars. The cost for all ONs in quantities of
10 µg each would amount in total to about 10-20
thousand dollars, and that is some 1000 times less than
the present commercial price.
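
The size of a bank holding every ON from 3-mers up to 8-mers can be checked with a short calculation. The following is only an illustrative sketch in Python; the function name `bank_size` is hypothetical and is not part of the described procedure.

```python
def bank_size(min_len: int, max_len: int, alphabet: int = 4) -> dict:
    """Number of distinct oligonucleotides of each length (4 bases per position)."""
    return {k: alphabet ** k for k in range(min_len, max_len + 1)}


sizes = bank_size(3, 8)
total = sum(sizes.values())

for length, count in sizes.items():
    print(f"{length}-mers: {count}")
print("total ONs in the bank:", total)
```

The 8-mers alone account for 65,536 entries, and the sum over all six lengths is 87,360, in line with the rounded figure of about 90,000.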

The usefulness of making an ON
bank on a solid support (perhaps even DP), which
could be subsequently processed by machines or
manually, affording specifically modified or longer
ONs, has been considered too (Beattie). The use of
the mixture of differently marked ONPs, either in
SBH or in other methods, secures the possibility of
ONP synthesis as a mix, instead of forming it by
mixing. For instance, 64 3-mers are synthesized,
each being placed on a large amount of DP and
being marked differently, for example, by
characteristics (fluorescence) of the molecule mediating
attachment of the nucleotide to DP. The mixtures of
equimolar amounts of all 64 3-mers are prepared

one 5-mer is continued in each part. In this way,
with 1088 synthetic reactions all 8-mers can be
synthesized in mixtures of 64 each.
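
The count of 1088 reactions is consistent with one reading of this scheme: 64 separate syntheses of the 3-mers, followed by 1024 extension reactions, one per 5-mer, each performed on a portion of the pooled 3-mers. The division of the pool into 1024 portions is an assumption inferred from the arithmetic, and the orientation of the extension is arbitrary in this sketch.

```python
from itertools import product

BASES = "ACGT"


def all_mers(k):
    """Every sequence of length k over the four bases."""
    return ["".join(p) for p in product(BASES, repeat=k)]


three_mers = all_mers(3)   # 64 separate syntheses on differently marked DPs
five_mers = all_mers(5)    # assumed: one extension reaction per 5-mer

reactions = len(three_mers) + len(five_mers)

# Each extension reaction acts on the pooled 3-mers, so it yields
# a mixture of 64 different 8-mers.
mixtures = {five: {three + five for three in three_mers} for five in five_mers}
distinct_8mers = set().union(*mixtures.values())

print("reactions:", reactions)
print("mixtures:", len(mixtures), "of", len(next(iter(mixtures.values()))), "8-mers each")
print("distinct 8-mers:", len(distinct_8mers))
```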

Similar principle could be applied for the syn- 
thesis of 10' DPS each carrying a different longer n 
ON (for example, a 16-mer). These are necessary
for preparation of samples in a mix, according to the
principle of selective hybridization (see above under
3). For instance, 3 groups with 100 DPs in each
are used (if possible, the 100 DPs have different
physical characteristics) and in each group 100 different
ONs are linked to DPs. For 16-mers, 2 groups are
5-mers and 1 is 6-mers. The same 3 groups of ONs exist
as free, not linked to DPs. In 6 separate reactions
involving successive permuted coupling of DP
groups and groups of non-attached ONs, 8 million
DPs (a definite number of DPs) with different ONs,
each having 16 bases, would be obtained.
In this way, with 300 different starting ONs and with
several reactions of permutation and linking, the
necessary number of various ONs in a mixture (in this
case a DP mixture) can be prepared.

Similar combinative synthesis can be applied
for obtaining the bank of DPs recognizable on the
basis of oligotarget combinations, without or with a
functional oligonucleotide which can play a role in
selecting a fragment with a complementary sequence
that exists in the given sample of nucleic acids. Marking
with this combination will be explained with the
example of using a group of 36 different oligotargets
and preparation of combinations with 18
different targets. The maximum number of these
combinations, if they were formed in separate reactions,
would be 9 billion. However, with a comparatively
small number of separate reactions, through a
successive linking of the combinations with a
smaller number of different oligotargets, it is possible
to obtain an essential part of all combinations of
18. If the 36 oligotargets are divided into three groups
of 12 each, each group will contain 924
combinations. After linking the first 924 combinations
in the same number of separate reactions, it
is necessary to effect equimolar mixing of all DPs
and to separate them into 924 tubes with combinations
from another group of 12 oligotargets. By repeating
the cycle once more, in 2772 reactions, a mixture
with 750 millions of DPs with different combinations
of 18 oligotargets is obtained. Of course, a DP with a



specified combination of oligotargets means that 
each oligotarget Is present in a certain number 
(10*-10*) on a given DP, DP having the same 
combination in a certain number in a final mixture,
which depends on the mass of non-marked DPs with
which the process was started. Thus,
when less than 10% of the combinations is used, a
sufficient number is obtained for generating and
labeling the necessary number of clones for SBH
of mammalian genomes. For recognizing the
same clones, i.e. DPs with the same combination, it
is necessary to carry out 36 hybridizations with
oligotargets complementary to probes.
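
The counts in this example can be reproduced with binomial coefficients. The sketch below assumes that each of the three groups of 12 oligotargets contributes a combination of 6 (an equal split of the 18 targets, inferred from the figure of 924 combinations per group); the variable names are illustrative only.

```python
from math import comb

total_targets = 36
targets_per_dp = 18
groups = 3

group_size = total_targets // groups          # 12 oligotargets per group
picked_per_group = targets_per_dp // groups   # assumed: 6 chosen from each group

direct = comb(total_targets, targets_per_dp)     # one reaction per combination
per_group = comb(group_size, picked_per_group)   # combinations within one group
reactions = groups * per_group                   # one cycle of reactions per group
labelled = per_group ** groups                   # distinct DP labels after all cycles

print("direct combinations of 18 from 36:", direct)
print("combinations per group:", per_group)
print("separate reactions over three cycles:", reactions)
print("distinct labelled DPs:", labelled)
```

Three cycles of 924 reactions, 2772 in total, give roughly 789 million distinct combinations, which is of the same order as the 750 million quoted and less than 10% of the roughly 9 billion combinations that direct formation would allow.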

For generating clones through a selection it is
essential that all DPs in a mixture carrying the
same combination of oligotargets possess, in a
certain number of copies, the same oligonucleotide
of the functional length selected. For 10 million
clones, taking into consideration that process efficiency
will be 10%, it is necessary to have 100
million DPs with different combinations and different
functional oligonucleotides. For SBH that
number can be within a range of 10*. 10- depend- 
ing on the lengths of DNA which are to be sequen- 
( 2, in one reaction. Besides, with ISBH one has to 
know which oligonucleotide is bound to which
combination on the same DP, which is not the
case with selective forming of clones. Also, for
ISBH one has to use a large amount of
oligosequences of the specific length.

The principle of preparation of these DPs will
be explained with the example of the synthesis of
all 15-mers in three cycles. Basically it is only an
extension of the procedure lor 
» oligotarget combinations. In any case, one third f> 
mer) toTall 15-mers. The three groups should be 
formed, each containing 1024 combinations (this is
the number of different 5-mers), starting from the
smallest number of different oligotargets. In this
case, it is a part of the combinations of 6 oligotargets
from a group of 13 oligotargets. The total number of
oligotargets is 39, i.e. this is the number of necessary
hybridizations for DP discrimination. In each
first combination from the group the first 5-mer is
added (for instance, AAAAA) and so on until
GGGGG is reached. At this moment, non-labeled DPs
are added to a combination from the first group,
and then the oligotargets and the given 5-mer are
linked to DPs. DPs are mixed and equimolarly
distributed into the combinations of the second group.
In this step, oligotargets are linked to DPs as
separate molecules, and 5-mers are added, affording
10-mers. In each of the 1024 reactions of the second cycle
1024 10-mers are synthesized, i.e. all 10-mers are
synthesized. By repeating the same operations in
the third cycle, all 15-mers are obtained in 3072 reactions
in such a manner that one knows
which 15-mer is on a DP with which specific
combination of oligotargets. 5-mers are not necessarily
added as the finished units, but instead synthesis
thereof can be executed in the given 3072 reactions.
When a complete procedure is performed in
5 cycles with 64 reactions, then totally only 320
independent reactions are required. In this case, it
is necessary to divide 40 oligotargets into 5 groups
of 8, with combinations of 4 oligotargets in each
group. These examples illustrate the power of
combinative synthesis, in which the number of
operations grows by arithmetical and the number of
syntheses by geometrical progression.
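
The arithmetic versus geometric growth can be illustrated with a small simulation of mixing and dividing. This is a toy sketch under stated assumptions: it uses 2-mer blocks instead of 5-mers so that it runs quickly, and the helper names `split_and_mix` and `smallest_group` are hypothetical.

```python
from itertools import product
from math import comb


def split_and_mix(blocks, cycles):
    """Simulate combinative synthesis; return (set of products, reaction count)."""
    pool = {""}
    reactions = 0
    for _ in range(cycles):
        next_pool = set()
        for block in blocks:              # one separate reaction per block
            reactions += 1
            next_pool.update(seq + block for seq in pool)
        pool = next_pool                  # mixing step before the next cycle
    return pool, reactions


def smallest_group(needed, picked):
    """Smallest group size n for which C(n, picked) covers `needed` labels."""
    n = picked
    while comb(n, picked) < needed:
        n += 1
    return n


dimers = ["".join(p) for p in product("ACGT", repeat=2)]
products, reactions = split_and_mix(dimers, cycles=3)
print(len(products), "different 6-mers from", reactions, "reactions")

# The two 15-mer schemes discussed in the text, by arithmetic only.
for block_len, cycles in ((5, 3), (3, 5)):
    per_cycle = 4 ** block_len
    print(cycles * per_cycle, "reactions ->", per_cycle ** cycles, "15-mers")

print("group size for 1024 labels, 6 per combination:", smallest_group(1024, 6))
print("group size for 64 labels, 4 per combination:", smallest_group(64, 4))
```

In the toy run, 48 reactions give all 4096 6-mers. The same bookkeeping gives 3072 reactions for three cycles of 5-mers and 320 reactions for five cycles of 3-mers, both covering all 4^15 15-mers.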

Taking into consideration the possibilities of combined
synthesis, bank forming and synthesis in a
mixture, one can conclude that the number of syntheses
and manipulations is not necessarily large
and, consequently, ON price should be directly
related to the mass, i.e. the cost of the material. The
use of DPs as a substitute for dot-blots offers in
this sense considerable advantages because it requires
a lower amount of ONP. For the same target
density per area a much smaller amount of target
per DP is necessary than per dot. The total area of
HA is also much smaller and this decreases the
necessary amount of hybridization buffer, i.e. ONP
mass. If DP diameter is 4 µm, then the area of its
maximum section is ca. 10 µm². When dot area is
1 mm², the ratio is 1:100,000. Based on the calculations
that in forming of a random monolayer 10-fold more
DPs must be used in order that each one is represented
at least once in HA, and that in this case
utilization of space is only 10%, the ratio would be
1:1000. Speaking in absolute numbers, in case of
DPs the area of one HA would amount to 10x10 cm,
while in dots it would be 1x1 m. Assuming that one
HA can be used for testing 1000 ONPs (mixture of
10-100 ONPs x 100-10 washings), the total area is
1 m² vs. 1000 m². In the first case the necessary
amount of each ONP in SBH can be calculated in
the following manner. One ONP has a target in
each tenth clone and, since ten times more DPs are
necessary because of random sampling, the total
number of DPs with which one ONP is hybridized
equals the number of clones (10 million). For
signal detection on one DP using CCD cameras, it
is sufficient to perform labeling with less than 1000
fluorescent molecules (private communication). If
one supposes that in hybridization only 0.1% of
ONP is made use of, then one needs 10¹³ ONP
molecules for hybridization with all clones. Since 1
µg of 8-mer ONP contains about 3x10¹⁴ molecules,
such a mass of an individual ONP is more than
sufficient. The dot system would probably require a
larger mass, of the order of 1 mg. The dollar
savings per one genome (or 100 GENOGRAMS),
according to Beattie's prices for a library, would
be about a million US $ in a transition from dot to
DP system.
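
The area and mass estimates above can be retraced step by step. The sketch below uses the figures given in the text; the average mass of about 330 g/mol per nucleotide and the value of Avogadro's number are assumptions added here for the molecule count, and all variable names are illustrative.

```python
import math

AVOGADRO = 6.022e23            # molecules per mole

# Area of one DP section versus one dot.
dp_diameter_um = 4.0
dp_section_um2 = math.pi * (dp_diameter_um / 2) ** 2   # about 12.6, rounded to 10 in the text
dot_area_um2 = 1.0e6                                    # 1 mm^2 expressed in um^2

# Hybridization area (HA) for 10 million clones.
clones = 10_000_000
oversampling = 10              # 10-fold more DPs in a random monolayer
space_utilization = 0.10       # only 10% of the area is covered by DPs
dps_in_ha = clones * oversampling
ha_area_um2 = dps_in_ha * 10.0 / space_utilization      # uses the rounded 10 um^2
ha_side_cm = math.sqrt(ha_area_um2) / 1.0e4             # 1 cm = 10^4 um

# ONP molecules needed for one probe.
dps_hit = dps_in_ha // 10      # a target in each tenth clone
labels_per_dp = 1000           # fluorescent molecules sufficient for detection
fraction_used = 0.001          # 0.1% of the ONP actually hybridizes
molecules_needed = dps_hit * labels_per_dp / fraction_used

mw_8mer = 8 * 330.0            # assumed average of ~330 g/mol per nucleotide
molecules_per_ug = 1.0e-6 / mw_8mer * AVOGADRO

print(f"DP section: {dp_section_um2:.1f} um^2, dot: {dot_area_um2:.0f} um^2")
print(f"HA side for DPs: {ha_side_cm:.1f} cm")
print(f"ONP molecules needed: {molecules_needed:.1e}")
print(f"molecules in 1 ug of 8-mer: {molecules_per_ug:.1e}")
```

Under these assumptions 1 µg of an 8-mer holds on the order of 10¹⁴ molecules, more than the roughly 10¹³ required, which agrees with the conclusion that such a mass is more than sufficient.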



Detection of ONs contents on a level of one DNA 
molecule 

If one restricts himself to consideration of
hybridization as a procedure for determination of ONs
contents, the problem of detection of a single target
molecule has two components. The first is the
possibility (successfulness, efficiency, probability)
of occurrence of the hybridization event with a
single target molecule (there can be an excess of
ONP) and the second is the detection of the hybrid
obtained. Since no efficient or simple procedure for
detection of single molecule hybridization has been
developed so far, there is no knowledge of this
reaction either. One can assume that the event of
single molecule hybridization occurs with a certain
probability (in a defined % of trials). The detection
of such an event can be of two kinds. In the first,
the detection of the signal is produced by the
marker on the hybridizing probe (e.g. fluorescent
molecule, enzymatic activity), even if later amplified in
various ways. In the second kind, the hybridization
event is amplified itself. Its logic is the same one
used in all exponential doublings in natural and in
vitro amplification reactions (cell division, DNA
replication, PCR, ligation-amplification reaction (LAR)).
The total amount of product is k x n^c, where n is
usually 2 and represents the amplification factor, c
is the number of cycles and k the efficiency factor.
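
As a small worked illustration of the product formula k x n^c, the sketch below computes the amount of product after a given number of cycles and the number of cycles needed to reach a chosen copy number. The function names and the target of 1000 copies are illustrative assumptions, not values prescribed by the method.

```python
def amplified_product(cycles, factor=2.0, efficiency=1.0):
    """Total product k * n**c after c cycles (k = efficiency, n = factor)."""
    return efficiency * factor ** cycles


def cycles_needed(target_copies, factor=2.0, efficiency=1.0):
    """Smallest number of cycles whose product reaches target_copies."""
    c = 0
    while amplified_product(c, factor, efficiency) < target_copies:
        c += 1
    return c


print(amplified_product(10))    # ideal doubling for ten cycles
print(cycles_needed(1000))      # cycles of ideal doubling to reach 1000 copies
```

With ideal doubling from a single event, ten cycles already give 1024 copies; a lower efficiency factor k or an amplification factor below 2 raises the number of cycles accordingly.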
For GENOGRAM determination one can use LAR,
since the basic requirement of the method is
obeyed, and that is the previous knowledge of the
sequence or the small number of its variants. LAR
is more difficult to apply to SBH of an unknown
sequence. The ON which is the reporter of the hybridization
event (usually carries biotin) would have to be very
short in order to use all theoretically possible
sequences in a mixture (probably 4-5-mer). In addition,
LAR has a problem of how to localize the ligation
product on dots or DPs on which it is formed. If the
problem of local fixation is resolved, one can avoid
the requirements of the specificity of the ligation
reaction and reporter molecules. The scenario for simple
amplification hybridization could look this way:
DPs carrying the capability of binding ONPs
having a specific chemical group on one end, and
a single target, are prepared. Then one hybridizes
with an ONP that is both complementary and carries
the chemical group. After discriminative
hybridization and washing, the reaction of denaturation
and binding of ONP to DP is performed. Care
should be taken that this is possible only
with an ONP which is hybridized to the target on a
given DP. In this way a DP having a positive
hybridization would form two targets. The hybridization
reaction is repeated, in which both the starting
and complementary ONPs are used (synthesized
target). In a new cycle of denaturation and binding
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«.. The wcood approach can potentially dlaolml- 
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phyaT.lv dineren, enfties. One can specula,, on 
rne ^ic *n«ia«,on o. chemical interactions on me 
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which wa, one m*de. I. nutat* on m« *u i 

MM without th* (nt^fltton ot th. td«itlit-.xp^ 

•»*«. th. INFORMATIONAL AP- 

PROACH pnwUM th. ot 

r***r*~nt. « th. .^r,M ot th. 

cofrVput. work. B*»d on the .««•■ '"^^f 

^►ph* town. powtt*. Of Mt*»*y 
pot**.. PftettC- proc^urw. ^ ««"«f**V 
poWnU.1 lor d««^ .«p*"-ntal r^^m^ 
That th. ««p*lrr-nul turtte. »»• »nd th. ONP 
™' „, by 

m. r-ctwry mw. ol th. Mmpte. 

b. probity rw^ « • cta.^ ~»TZ%ZZ 
in • dot-Wot tyth*" with th. vetMlt lor ctoo. 
cultivation which tr. «t lM«t 100 um wch p. 
ctoo, Th. tout volume lor 100 million wmpt.. 
wT« b^o. th. order o. W Wot. * 
,nvented mKre-hybrtd.«.tion ""^J^ 
mor. tmporunt Mo «-c. iMMtUI Mvtngs 
ruction, m th. numb* ot robot* opw.Bon. 
A robotic hwxJ with 10000 porting ling*, need. 
tooTo^rttlon. to m*. on. mm with .Or*, lion 
do,, WhO th. DP .yttem. . robotic land with 

piping «•» (10-1000) c«i P^orm 
.nalogoui Uik In • tingle oprK.tlon. Alt Nt cm 
J^TtT. mim.nrrtr.tion ot "g^omtc m«.«.tton» 
on, 0 . tlx. ot th. blggw l*xx«tory .n»trum.n» ot 

The DP system in essence represents the imitation
of a multitude of biochemical reactions occurring
simultaneously within a single cell. Specificity
and discreteness of cellular reactions are based
on enzyme action, whose informational properties
are imitated here by DPs. The use of DPs requires
an at least 10-fold increase in the number of unit
information bits, but time and labour investments
(preparatory and robotic operations) for obtaining



Ih. comptet. data set ar. r«)oc*J »W»I «<".». 
T^^Sri ..ample, whteh Wlow v. WwW to 
^howhowco.c*, tran..^ th. e^ * °£«<^<' 
. in IMAGE ANALYSIS and thut. to mak. th. moat 
.md.it t*P- H • roboti^ oot 

o^. ^y^TT. cerulnty I. rop** hj*h 
me nrobabiUty thtt MCh don. It r»or.Mn.^ .t 

m . total numb, ol DPi. On th. oth. hand. But 

^o^tmlnrtot. ol potlthr. »<V^^. ™t 

out^J .«>«tmental pertormanc tovrtt. Ther.to«», 
^ £Xbr«i..«on ttgr-l ro^ng pro- 
c*dur.t mu« toU<«. th. Ubrarte. <^*^ °^ 
Urg. numb. Ot lr^rr*ntt which. Intum. rttort 
ol wuut. numb, ol thort. ONPt^Jhit it 
^nt in OENOORAM «PP»frton. In- 
*^V»P^nr ehoo*ng ol tOOOO p*tol 
p^r-ra a«J ONP.. It I. mor, .tflcl^n. » pMorm 
hvtortdtiattor. ol an amp«fi^ Ir^rr^nU with .11 
OW Th. It KKth. 

, th. rM»za»on that th. oowmg " ,n ^ m • 

o, „«w mutation., both ot which <*n b. ol eon«d 
•rMM d»gn°<«c valu.. By etching th. «npM«« 
„ to me IMAGE ANALYSIS o- ico^ word., by 
d^mg . vt-um. ot ^p^rr^ru. wor^^J. 
cMiibt. to obtain targ. number, ol detaiwa 
S^^OORAMS according to th. Mm. prtocip* » 
m <*termH»oon ot the ^tlr. O^^*^^' 

Th. UM Ot thM. Ch.^rttttct Ol INFOP.MA 
TONAL APPROACH provide lor M . M 
b»**» mW.turUitlon. • Orwt. tpwd * =om 
pT^-rmme proct^t raring 
oTtMrmg ol »ariou. and mor. compl.x d»ta biu^ 
K ^T/^t«ng to attempt to out^n. ^ ^com- 
^om aiulvttt the «lv«iug.t Md diMdvan 
^CONFORMATIONAL APPROACH .0. 

oX«ic t*^-oemg th. thrM prc<*dv..t 
tt^uUWwJENTAL APPROACH coning 
„ oTtoZt d.«.rm.n.ng method.. The , 

meioTuMd up K> now. wWch It b^don ^the 
unding ot th. portion by measuring the fcogth o. 
ONA fragment, ha. two r^rement. which are 
almott ctuinty .xcWIng It « . method o. choic. 
„ Th«. th. practical impos^billty ot minl.turlza- 
^ and r*ed to. ut. ol ^P«IW «^ n '' °' 
^ ONA. The oth* two <"^£££ 
"bh or other procedures us.ng the INFORMA 
TIONAL APPROACH h«. not bwn .Kpenment. ly 
55 v,nt*d no lar. do no. impose these '^"^^ 
The tunneling electron microscopy, used as a tool
for direct reading, is an inherently miniaturized
procedure which does not require amplification of a
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DNA element On t* <*m hand, 
itmo^ of base by base from on* *ndo» OHA 
,raoment. fotowed by contnuaf "P*^**?^ 
and ***** registration m passage by t» enac- 
tor, lor p/acacal reason* almost certaWy requ*** 
iheuMVd ctatocfton on a level of one motoo^ 
Probably. M to very dmVa*. or may be W impoa- 
«Dle, to tynchponir* removal of *e tame 

can be imagined t*at f*s approach can b« "wv 
iaturtzed and mad* P*f»ee1. **e fr*** 1 01 
oVeased reacftons. multiple mierotubes can be 
mod anauring d au e» n <M to adcHton, separate* 
of removed nucleotides by a water flow does not
require macro-separation, as is the case in
separation of DNA fragments with an accuracy to
the level of one base in acrylamide gel. The main
requirement of this approach is precision and
speed of detection of single events, especially
parallel detection in large numbers of microtubes. The
question is whether the use of lasers and fluorescence
labeling combined with pixel-based image
analysis can allow an acceptable data acquisition
speed with non-prohibitively complex equipment.

In any case, both procedures rely on
achievements in physics, while the INFORMATIONAL
APPROACH is exclusively based on biochemical,
molecular processes. That is so because in SBH,
as stated here, one can arrive at the experimental
image and IMAGE ANALYSIS with minimal
technical requirements. Since there is an indirect
detection of molecular reactions, SBH does not
have to have atoms and, since it does not use
position information, SBH does not require any
physical ordering of reaction, allowing the use of
amplified fragments.

All methods have a common last step including
image analysis of an "experimental image". The question
is what is the ease of arriving at this analysis.
It appears to us that SBH is more adapted, more
efficient for sequencing a large number of complex
genomes. Due to its requirements for preparation
of ONPs and DPs SBH is valuable, and perhaps
more efficient for sequencing on a large scale in
comparison with other methods. The reduction of
individual genomes on a common denominator - 
OrsiS. allows the use of informational work after 
image analysis for sequence generation. The entire
work in the non-informational approach is of
experimental character.



Claims



1. Process for determination of partial or entire
nucleic acid sequence by the hybridization of the
samples in a mixture, characterized in that multiplied
or synthesized or separated DNA or RNA
molecules in separate reactions are bound to discrete
particles (DPs) of microscopic size which can
be discriminated according to physical and chemical
characteristics thereof, DPs are mixed, then
they are hybridized with an individual or with a
group of probes which are natural or multiplied or
separated DNA or RNA molecules, and the result
of hybridization on the individual samples is detected
either by an ordered flow of DPs one by one
which are passed by a detector, or by a
monolayer spread of DPs which permits detection
by image analysis.

2. Process according to Claim 1, characterized
in that natural or multiplied or synthesized or
separated DNA or RNA molecules are linked in
separate reactions to discrete particles (DPs) which
can be, but not necessarily, discriminated according
to physical and biochemical characteristics
thereof, DPs are mixed, the mixture spread into
one big or several smaller separated areas on a
solid support, after which DPs are fixed to the
support.

3. Process according to any of the preceding
claims, characterized in that from the same number
of preparations of different oligonucleotides of
the known formula different mixtures are prepared
using combinations of a certain number of starting
oligonucleotides, each mixture is bound in separate
reactions to DPs, and DPs carrying the same combination
of oligonucleotides are recognized through
a hybridization with oligo probes which are complementary
to starting oligonucleotides.

4. Process according to any of the preceding
claims, characterized in that in a small number of
the reactions DP mixtures carrying specific combinations
of oligonucleotide targets, which are used
for recognition of DPs, are made, oligonucleotide
targets are divided into a specific number of
groups and combinations from each group are
formed and placed in the separate tubes with a
specific number of different oligonucleotide targets
from the given group, then either identical DPs are
added in the tubes with combinations or an
equimolar ratio of DPs which can be discriminated in
each tube according to physical characteristics
thereof is added, binding of oligonucleotide targets
to DP is carried out, DPs from all reactions are
mixed either equimolarly or in specific cases in the
specific ratio, then they are divided into the tubes
with combinations of oligotargets from another
group, and the given cycle of mixing and dividing
DPs and binding the new combinations of oligotargets
to DPs is repeated as many times as there
are groups of oligotargets.
5. Process according to Claims 3 and 4, characterized
in that DP besides certain combinations
of the oligonucleotide targets or some other marker
contains a functional oligonucleotide of defined
length, the length of the functional oligonucleotide which is
common for all of DPs in a given reaction being
synthesized before, during or after the binding of
the given oligotarget combination to DP or other
marker, in each cycle in a given tube with the
specific oligotarget combination, in that way that in
a successive reaction of the binding of the oligotarget
combinations, or other markers, continues the
synthesis of a needed part of the functional
oligonucleotide on the part synthesized in the previous
cycle, or the entire given part which is synthesized
in the independent process binds to the part
synthesized, or bound in the previous cycle, that is,
binds to the DP in the first cycle.

6. Process according to Claims 1, 2 and 5,
characterized in that the hybridization surface
(area) consists of solid support with a fixed monolayer
spread of DPs with characteristics made in a
manner that in DPs is represented the informative
part of or all possible different oligonucleotides of
certain lengths, the position is determined of DPs
with each and every functional nucleotide by
hybridization with oligotargets which are used for forming
of the combinations on the DPs, or in some
other way, if the DPs are not labeled with the
oligotarget combinations.

7. Process according to Claims 1 and 6, characterized
in that a sufficient mass of the given
nucleic acid sample whose total complexity is not
too high for the functional oligonucleotide length in
a given hybridization surface (area) is cut in a
random process in very short fragments, although
longer than the functional oligonucleotides, generated
fragments are then labeled, discriminative
hybridization with the hybridizational surface (area)
with the characteristics is performed, in the process
of microscopic image analysis of the given
hybridizational surface (area) is determined on
which DP the positive hybridization did take place,
obtained information, on the basis of the information
from the given surface (area) on the position of
the DP with the given functional oligonucleotide, is
translated into content of the oligonucleotide
sequences in the given nucleic acid sample and
finally by computer analysis partial or total nucleic
acid sequence in the given sample is obtained.

8. Process according to Claims 6 and 7, characterized
in that for the determination and tracking
of the heredity of the large number of the genomic
or gene polymorphic sequences for identification of
the person, determination of the relatedness or
evolutionary distance, detection of the changes on
the genome and genes, prenatal and postnatal
prediction of the phenotype characteristics, determination
of the biological function of the individual
genes or gene complexes by determination of the
sequence, a certain fragment or total human DNA
of the person, or individual cell, or mRNA, or cDNA
of the certain tissue or group of the tissues, is
processed and hybridized with hybridization surfaces
that are containing sufficient number of different
functional oligonucleotides, so that in the
largest number of cases have complementary
sequences on a single point in a given nucleic acids
sample, and from pattern differences between samples
from individuals, the polymorphic sequences
are determined, whose haplotype combinations in a
new sample could be determined by applying the
same procedure.

9. Process according to Claim 1, characterized
in that oligonucleotide probes in a certain number
of moles are bound to visually recognizable
discrete particles (DP) that are different from probe
to probe so that one can apply a mixture of
oligonucleotide probes, and hybridization
event after suitable hybridization washing is recognized
as a rosette of discrete particles containing
corresponding probe bound to discrete particle or a
point on the solid support where a target containing
complementary sequence is placed.

10. Process according to Claim 1, characterized
in that for identification of the uncloned genes
or gene families and parallel investigation of the
place, time and modulation of the total gene expression
by means of determination of the sequence
of the mRNA or cDNA prepared from a
certain tissue, a certain cell cultures, a certain
tissue at a certain stage of the ontogenic development
or cell cultures or tissues after the influence
of certain environmental agents, are hybridized with
the sufficient number of a single or groups of the
oligonucleotide probes of a length of 4 to 12
bases and on the basis of the detected
oligonucleotide sequence content the pattern
(profile, stage) of the expression or relatedness of
the genes is determined.

11. Process according to Claim 11, characterized
in that for applications for determination of
the partial or the total sequence, the genomic DNA,
mRNA or cDNA library of the given biological sample
(DNA, mRNA or cDNA) are bound to the DPs
and hybridized with a part or all of the
oligonucleotide probes of a length of 4 to 20
bases, and on the basis of the detected content of
the oligonucleotide sequences, by means of
computer processing, a partial or total sequence of
the individual clones is obtained, and thereby the
sequence of the given nucleic acid sample.

12. Process of forming of the library of the
discrete particles (DPs) for determination of the
partial or the total nucleic acid sequence by
hybridization of the samples in a mixture according to
Claims 10 and 11, characterized in that DPs are
carrying a single molecule or a certain molarity of
the same or mostly overlapped fragments of the
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aenot ^c desoxyribooocWc acid (°™>" nbonu - 
^T^d (RNA). so that the Sbrary conU»ns a 
t^Zribm of discrete particles with the same 

contents o( the sequences « a starbng 
tamrtes are represented, by using the mnrturecrf 
tunctionai oSg^ucteobdes 
^^qoences appear only ooce « are non- 
!™^t VTstart^g nucleic acid sample n wh,ch to 
^^uence is represented in a Urge number 
T^o^TbTrneans '« hybridization proce ss . 
tC* the nucleic acid fragments « performed 
TZ as their fixation to the DPs 

.. o. me DPS with the same h»cbona. 
olioonucleotides conztin fragments wrth the same^ 
^Sy^rtapped or very similar sequences .rem 

""^Process according to Claim 12. character- 
bed in that to the genomic fragments ofappro- 
oriate sizes are enzyrnaBcally bound on both ends 
Sesame o, different short fragment of the desox- 

add (DMA), followed by 
iorTTme fragments, so that the form,ng .s permrt- 
£? of M senate samples containing a sing* <* i 
£ ILlTand - vitro * 
is performed by polymerase chain reaction using
primers that are complementary to the ligated
fragments, or by using ribonucleic acid (RNA)
polymerases, if the ligated fragments are promoter

14. Process according to Claim 12, characterized
in that the library formation of the discrete
particles (DP) is performed by using amplification
reactions in which conglomerates of the DPs by
random process are enclosed with the certain
amount of the amp.ificat.on <™^ r ° ^5™" 
none genomic desoxyribonucle,c acid (DNA) frag 
mCs or complementary DNA or ribonucle,c aad 
membranes impermeable for the macro- 
motecu.es. followed by an amp.ificat.on. DNA ,s 
Ted to the DPS. followed by disrupts of the 
membranes and conglomerates ^ ~£ 
dividual DPS mixture in which ma,onty of the CPs 
a .arge number ol copies of the same 

foment of *• 9^ WK 

15. Discrete particles according to Claims 1 to
14, characterized in that they possess the same
or different physical or chemical characteristics and
are containing combinations of the different
oligonucleotides of the known formulas that are
represented either as an individual molecule or
with the certain molarity of each, so that the given
oligonucleotide combination serves for discrete
particles recognition by hybridization, or in any
other obvious way.

16 Discrete particle (DP) conglomerates ac 



cording to Claim 14. characterized in that DPs in 
c^Srlrate have the same or 
chemical characteristics and are bound together by 
t^eTphysical or chemical bonds, thus enabhng 
dulling ,0 indrvidua. DPs the discrete 
particles between different conglomerates are recognized
by size, shape, colour, chemical properties
or by oligonucleotide combinations or in any
other obvious way.

17. The mixture of the discrete particles (DPs)
according to Claims 3 and 6, characterized in that
OP * the mixture contains one functional 
o^nucLde as a single molecule, or in a certa," 

o^nuSeotiT possess the same phys^or 
chemical characteristics, anc . d^rete r^cles 
containing a different otigonucteot.de can by pnys 
STcnemica. characteristics, be 
Sent by size, shape and colour, or can corrtan 
IZenToi^onucteotide combination, or in any oth- 

m, TZZ* DNA fragments. cDNA. cRNA 
molecules and their sequences, characterized in
that they are identified, or isolated, or that their
sequence is determined, by using processes according
to Claims 1 to 14.